Last updated: 2020-11-09

Checks: 7 passed, 0 failed

Knit directory: T47D_ZR75_DHT_StrippedSerum_RNASeq/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(12345) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version fda3bb2. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .snakemake/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/qc_raw.Rmd) and HTML (docs/qc_raw.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File  Version  Author          Date        Message
Rmd   fda3bb2  Steve Ped       2020-11-09  Rewrote qc_raw
html  ea635ab  Steve Ped       2020-11-09  Build site.
Rmd   8d91bfc  Steve Ped       2020-11-09  Rewrote qc_raw
html  3502d11  Steve Ped       2020-11-09  Build site.
Rmd   9d03800  Steve Ped       2020-11-09  Added tabsets for both runs
Rmd   6f7af53  Steve Pederson  2020-11-06  Initial Commit

library(ngsReports)
library(tidyverse)
library(yaml)
library(scales)
library(pander)
library(glue)
library(plotly)
panderOptions("table.split.table", Inf)
panderOptions("big.mark", ",")
theme_set(theme_bw())
config <- here::here("config/config.yml") %>%
  read_yaml()
suffix <- paste0(config$tags$tag, config$ext)
# Abbreviate the species name to genus initial + species (e.g. "Hsapiens")
sp <- config$ref$species %>%
  str_replace("(^[a-z])[a-z]*_([a-z]+)", "\\1\\2") %>%
  str_to_title()
samples <- config$samples %>%
  here::here() %>%
  read_tsv() %>%
  mutate(
    R1 = paste0(sample, config$tags$r1, suffix),
    R2 = paste0(sample, config$tags$r2, suffix),
  ) %>%
  pivot_longer(
    cols = c("R1", "R2"),
    names_to = "Reads",
    values_to = "Filename"
  )
config$analysis <- config$analysis %>%
  lapply(intersect, y = colnames(samples)) %>%
  .[vapply(., length, integer(1)) > 0]
if (length(config$analysis)) {
  samples <- samples %>%
    unite(
      col = group, 
      any_of(as.character(unlist(config$analysis))), 
      sep = "_", remove = FALSE
    )
} else {
  samples$group <- samples$Filename
}
group_cols <- hcl.colors(
  n = length(unique(samples$group)), 
  palette = "Zissou 1"
  ) %>%
  setNames(unique(samples$group))
fh <- round(6 + nrow(samples) / 7, 0)

Quality Assessment on Raw Data

rawFqc <- here::here("data/raw/FastQC") %>%
  list.files(pattern = "fastqc.zip", full.names = TRUE, recursive = TRUE) %>%
  FastqcDataList() %>%
  .[fqName(.) %in% samples$Filename]
for (i in seq_along(rawFqc)){
  run <- str_extract(path(rawFqc[[i]]), "Hiseq_[12]")
  rawFqc[[i]]@Summary$Filename <- paste(run, rawFqc[[i]]@Summary$Filename, sep = "/")
} 

FastQC Summary

plotSummary(rawFqc)

Overall summary of FastQC reports

Version Author Date
ea635ab Steve Ped 2020-11-09
3502d11 Steve Ped 2020-11-09

Library Sizes

A total of 64 libraries were contained in this dataset, with read totals ranging between 5,699,001 and 28,623,352 reads.

Across all libraries, reads were between 99 and 100 bases long. This indicates that some read trimming had been performed prior to that undertaken here.
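The library-size summary can be reproduced from a table of per-library read totals. The following is a minimal base R sketch, assuming a hypothetical data frame `totals` with columns `Filename` and `Total_Sequences`. The file names and the middle value are invented for illustration; only the smallest and largest totals are taken from the text above. Note that `range()` always returns the minimum first, so the two values are reported in ascending order.

```r
# Hypothetical per-library read totals (illustrative file names and values;
# in the report the totals come from the FastQC output)
totals <- data.frame(
  Filename = c("lib_A_R1.fastq.gz", "lib_B_R1.fastq.gz", "lib_C_R1.fastq.gz"),
  Total_Sequences = c(5699001L, 28623352L, 12000000L)
)

# range() returns c(min, max), so the smaller value always comes first
rng <- range(totals$Total_Sequences)

# Format with a thousands separator for reporting
rng_txt <- format(rng, big.mark = ",", trim = TRUE)

summary_txt <- sprintf(
  "A total of %d libraries, with read totals ranging between %s and %s reads.",
  nrow(totals), rng_txt[1], rng_txt[2]
)
print(summary_txt)
```

The same `range()` pattern applies to read lengths, which avoids reporting the larger value before the smaller one.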

plotReadTotals(rawFqc, pattern = suffix, usePlotly = TRUE)

Library sizes for all supplied FASTQ files. Samples sequenced as multiple libraries are shown as separate libraries and have not been merged.

Sequence Quality

plotBaseQuals(
  rawFqc,
  pattern = suffix, 
  usePlotly = TRUE,
  dendrogram = TRUE,
  cluster = TRUE
  )

Mean sequencing quality scores at each base position for each library

GC Content

plotGcContent(
  x = rawFqc, 
  pattern = suffix, 
  species = sp, 
  gcType = "Trans",
  usePlotly = TRUE,
  dendrogram = TRUE,
  cluster = TRUE
  )

GC content shown as the % above and below the theoretical GC content for the Hsapiens transcriptome.

ggplotly(
  getModule(rawFqc, "Per_sequence_GC_content") %>%
    group_by(Filename) %>%
    mutate(
      cumulative = cumsum(Count) / sum(Count)
    ) %>%
    ungroup() %>%
    separate(Filename, into = c("Run", "Filename"), sep = "/") %>%
    left_join(samples) %>%
    bind_rows(
      getGC(gcTheoretical, sp, "Trans") %>%
        mutate_at(sp, cumsum) %>% 
        rename_all(
          str_replace_all, 
          pattern = sp, replacement = "cumulative",
        ) %>%
        mutate(
          Filename = "Theoretical GC",
          group = Filename
        )
    ) %>%
    unite(Filename, Run, Filename, sep = "/") %>%
    mutate(
      group = as.factor(group),
      group = relevel(group, ref = "Theoretical GC"),
      cumulative = round(cumulative*100, 2)
    ) %>%
    ggplot(aes(GC_Content, cumulative, group = Filename)) +
    geom_line(aes(colour = group), size = 1/3) +
    scale_x_continuous(label = ngsReports:::.addPercent) +
    scale_y_continuous(label = ngsReports:::.addPercent) +
    scale_colour_manual(
      values = c("#000000", group_cols)
    ) +
    labs(
      x = "GC Content",
      y = "Cumulative Total",
      colour = "Group"
    )
)

GC content shown as a cumulative distribution for all libraries. Groups can be hidden by clicking on them in the legend.
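The cumulative distribution in this plot is a running sum of the per-sequence GC counts within each library, divided by that library's total. A minimal base R sketch of the calculation follows, using a small invented data frame `gc` rather than the real FastQC module. It assumes rows are already ordered by `GC_Content` within each library.

```r
# Hypothetical per-sequence GC counts for two libraries (not real data)
gc <- data.frame(
  Filename   = rep(c("lib_A", "lib_B"), each = 4),
  GC_Content = rep(c(30, 40, 50, 60), times = 2),
  Count      = c(10, 40, 40, 10, 20, 20, 30, 30)
)

# Cumulative proportion within each library: running sum over library total.
# Assumes rows are sorted by GC_Content within each Filename.
gc$cumulative <- ave(
  gc$Count, gc$Filename,
  FUN = function(x) cumsum(x) / sum(x)
)

# Express as a percentage rounded to two decimals, as in the plotted values
gc$cumulative <- round(gc$cumulative * 100, 2)
print(gc)
```

Every library reaches 100% at its highest GC value, so differences between libraries show up as horizontal shifts of the curve against the theoretical line.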

Sequence Content

plotly::ggplotly(
  getModule(rawFqc, module = "Per_base_sequence_content") %>% 
    mutate(Base = fct_inorder(Base)) %>%
    group_by(Base) %>% 
    mutate(
      across(c("A", "C", "G", "T"), function(x){x - mean(x)}) 
    ) %>% 
    pivot_longer(
      cols = c("A", "C", "G", "T"), 
      names_to = "Nuc", 
      values_to = "resid"
    ) %>%
    separate(Filename, into = c("Run", "Filename"), sep = "/") %>%
    left_join(samples) %>%
    unite(Filename, Run, Filename, sep = "/") %>%
    ggplot(
      aes(Base, resid, group = Filename, colour = group)
    ) + 
    geom_line() +
    facet_wrap(~Nuc) + 
    scale_colour_manual(values = group_cols) +
    labs(
      x = "Read Position", y = "Residual", colour = "Group"
    )
)

Base- and position-specific residuals for each sample. For each nucleotide, the mean content across all libraries was calculated at each read position, and each sample's residual is its deviation from that mean.
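As an illustration of the residual calculation, the base R sketch below subtracts the position-wise mean from each library's value for a single nucleotide. The data frame `content` and its values are invented for the example; the real analysis applies the same step to all four nucleotides.

```r
# Hypothetical %A at two read positions for three libraries (not real data)
content <- data.frame(
  Filename = rep(c("lib_A", "lib_B", "lib_C"), times = 2),
  Base     = rep(c("1", "2"), each = 3),
  A        = c(24, 26, 28, 30, 30, 33)
)

# ave() returns the group mean by default, here the mean across libraries
# at each read position; the residual is each library's deviation from it
content$resid <- content$A - ave(content$A, content$Base)
print(content)
```

Residuals sum to zero at every position, so a library that sits consistently above or below zero differs in composition from the rest of the dataset.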

Adapter Content

plotAdapterContent(
  x = rawFqc, 
  pattern = suffix, 
  usePlotly = TRUE,
  dendrogram = TRUE,
  cluster = TRUE
  )

Total Adapter Content for each sample shown by starting position in the read.

Overrepresented Sequences

os <- suppressMessages(getModule(rawFqc, "Over"))
os_fh <- max(20, 6 + nrow(os) / 20)
if (nrow(os)){
  if (length(unique(os$Filename)) > 1){
    suppressMessages(
      plotOverrep(
        x = rawFqc,
        pattern = suffix, 
        usePlotly = TRUE,
        dendrogram = TRUE,
        cluster = TRUE
      )
    )
  }
}

Summary of over-represented sequences across all libraries

os %>%
  group_by(Sequence, Possible_Source) %>%
  summarise(
    `Found in` = n(),
    Total = sum(Count),
    `Largest Percent` = glue("{round(max(Percentage), 2)}%")
  ) %>%
  arrange(desc(Total)) %>%
  pander(
    caption = "*Summary of over-represented sequences within the raw data.*"
  )
Summary of over-represented sequences within the raw data.
Sequence Possible_Source Found in Total Largest Percent
CCTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATG No Hit 32 2,775,012 0.88%
CTGGAGTCTTGGAAGCTTGACTACCCTACGTTCTCCTACAAATGGACCTT No Hit 32 1,869,452 0.64%
CTCCGTTTCCGACCTGGGCCGGTTCACCCCTCCTTAGGCAACCTGGTGGT No Hit 32 1,667,442 0.53%
CTCCTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGA No Hit 32 1,551,495 0.63%
CCCCTCCTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATAT No Hit 32 1,505,323 0.53%
CCCTCCTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATT No Hit 32 1,382,213 0.5%
CTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATGC No Hit 30 1,173,525 0.47%
CCCAAACCCACTCCACCTTACTACCAGACAACCTTAGCCAAACCATTTAC No Hit 16 1,082,742 0.56%
CTTGGTTATAATTTTTCATCTTTCCCTTGCGGTACTATATCTATTGCGCC No Hit 30 1,075,631 0.51%
CTCTCTACAAGGTTTTTTCCTAGTGTCCAAAGAGCTGTTCCTCTTTGGAC No Hit 32 1,048,457 0.35%
CCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGATCA No Hit 26 1,042,034 0.53%
CCTCCTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTG No Hit 32 989,640 0.31%
CTTATTTCTCTTGTCCTTTCGTACAGGGAGGAATTTGAAGTAGATAGAAA No Hit 32 925,237 0.35%
CTGAACTCCTCACACCCAATTGGACCAATCTATCACCCTATAGAAGAACT No Hit 32 893,807 0.3%
CTGGGCTGTAGTGCGCTATGCCGATCGGGTGTCCGCACTAAGTTCGGCAT No Hit 30 830,226 0.26%
CTCTAGAATAGGATTGCGCTGTTATCCCTAGGGTAACTTGTTCCGTTGGT No Hit 31 828,390 0.27%
CGGGGGAAGGCGCTTTGTGAAGTAGGCCTTATTTCTCTTGTCCTTTCGTA No Hit 32 800,786 0.37%
CCCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTT No Hit 32 797,383 0.25%
CTCCGAGGTCGCCCCAACCGAAATTTTTAATGCAGGTTTGGTAGTTTAGG No Hit 32 778,604 0.25%
CTGGTTTCGGGGGTCTTAGCTTTGGCTCTCCTTGCAAAGTTATTTCTAGT No Hit 30 777,092 0.29%
CTTGAGTCCAGGAGTTCTGGGCTGTAGTGCGCTATGCCGATCGGGTGTCC No Hit 30 768,861 0.23%
CGTTGGTCAAGTTATTGGATCAATTGAGTATAGTAGTTCGCTTTGACTGG No Hit 30 740,767 0.22%
CCCAAACCCACTCCACCCTACTACCAGACAACCTTAGCCAAACCATTTAC No Hit 16 729,714 0.4%
GTTCTGGGCTGTAGTGCGCTATGCCGATCGGGTGTCCGCACTAAGTTCGG No Hit 30 725,496 0.23%
CTCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGATC No Hit 25 719,900 0.27%
CAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGATCAG No Hit 25 699,697 0.27%
CTTTAATCGTTGAACAAACGAACCTTTAATAGCGGCTGCACCATCGGGAT No Hit 31 691,249 0.21%
CTACAATCAACCAACAAGTCATTATTACCCTCACTGTCAACCCAACACAG No Hit 32 664,586 0.21%
CCCTGTTCTTGGGTGGGTGTGGGTATAATGCTAAGTTGAGATGATATCAT No Hit 15 659,116 0.47%
CCCTGTTCTTGGGTGGGTGTGGGTATAATACTAAGTTGAGATGATATCAT No Hit 15 641,218 0.4%
GTCCAATTGGGTGTGAGGAGTTCAGTTATATGTTTGGGATTTTTTAGGTA No Hit 25 620,097 0.29%
CTCTAAATCCCCTTGTAAATTTAACTGTTAGTCCAAAGAGGAACAGCTCT No Hit 28 617,871 0.23%
CTCATTTGGATGTGTCTGGAGTCTTGGAAGCTTGACTACCCTACGTTCTC No Hit 29 610,661 0.24%
CCTCGATGTTGGATCAGGACATCCCGATGGTGCAGCCGCTATTAAAGGTT No Hit 29 604,876 0.19%
GTATAATACTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTGTG No Hit 16 600,508 0.44%
CGCTTGAGTCCAGGAGTTCTGGGCTGTAGTGCGCTATGCCGATCGGGTGT No Hit 29 589,770 0.21%
CTCAGACCGCGTTCTCTCCCTCTCACTCCCCAATACGGAGAGAAGAACGA No Hit 23 589,407 0.31%
GCTACCTTTGCACGGTTAGGGTACCGCGGCCGTTAAACATGTGTCACTGG No Hit 23 564,289 0.21%
CACCAGGTTGCCTAAGGAGGGGTGAACCGGCCCAGGTCGGAAACGGAGCA No Hit 26 551,677 0.21%
GTGGGTATAATGCTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTT No Hit 15 547,997 0.37%
CTTTCGTACAGGGAGGAATTTGAAGTAGATAGAAACCGACCTGGATTACT No Hit 26 543,299 0.22%
CGCGTGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCT No Hit 27 517,875 0.17%
GTAAGATTTGCCGAGTTCCTTTTACTTTTTTTAACCTTTCCTTATGGGCA No Hit 15 501,227 0.33%
CTGCTCCGTTTCCGACCTGGGCCGGTTCACCCCTCCTTAGGCAACCTGGT No Hit 24 490,368 0.21%
CCTTGCTATATTATGCTTGGTTATAATTTTTCATCTTTCCCTTGCGGTAC No Hit 21 483,807 0.23%
CCCCAACCGAAATTTTTAATGCAGGTTTGGTAGTTTAGGACCTGTGGGTT No Hit 25 483,472 0.2%
CTGCTGTTTCCCGTGGGGGTGTGGCTAGGCTAAGCGTTTTGAGCTGCATT No Hit 21 480,245 0.24%
GGGTATAATACTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTG No Hit 14 475,773 0.32%
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN No Hit 10 460,429 0.43%
GTGCTCTTTTAGCTGTTCTTAGGTAGCTCGTCTGGTTTCGGGGGTCTTAG No Hit 21 460,116 0.21%
GTCTGGAGTCTTGGAAGCTTGACTACCCTACGTTCTCCTACAAATGGACC No Hit 23 456,100 0.2%
GGTGGCGCGTGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCTGGAGGA No Hit 23 447,313 0.18%
CCTAGCCTTTCTATTAGCTCTTAGTAAGATTACACATGCAAGCATCCCCG No Hit 24 433,607 0.15%
GTCCTTTCGTACAGGGAGGAATTTGAAGTAGATAGAAACCGACCTGGATT No Hit 20 431,960 0.18%
CTTTCCTTATGGGCATGCCTGTGTTGGGTTGACAGTGAGGGTAATAATGA No Hit 15 420,384 0.25%
CGAGGGTTCAGCTGTCTCTTACTTTTAACCAGTGAAATTGACCTGCCCGT No Hit 22 408,193 0.14%
CAACAATAGGGTTTACGACCTCGATGTTGGATCAGGACATCCCGATGGTG No Hit 25 408,185 0.14%
CTGTTCTTGGGTGGGTGTGGGTATAATGCTAAGTTGAGATGATATCATTT No Hit 15 408,168 0.23%
CTCAGATCACGTAGGACTTTAATCGTTGAACAAACGAACCTTTAATAGCG No Hit 22 397,779 0.19%
CTCCACCTTACTACCAGACAACCTTAGCCAAACCATTTACCCAAATAAAG No Hit 16 397,140 0.21%
CTCGGAGGTTGGGTTCTGCTCCGAGGTCGCCCCAACCGAAATTTTTAATG No Hit 21 386,488 0.19%
CAGGAGTTCTGGGCTGTAGTGCGCTATGCCGATCGGGTGTCCGCACTAAG No Hit 22 374,557 0.14%
CTTAGCCAAACCATTTACCCAAATAAAGTATAGGCGATAGAAATTGAAAC No Hit 17 358,177 0.18%
CTGAGTTCAGACCGGAGTAATCCAGGTCGGTTTCTATCTACTTCAAATTC No Hit 20 352,904 0.15%
CTCGTCTGGTTTCGGGGGTCTTAGCTTTGGCTCTCCTTGCAAAGTTATTT No Hit 18 345,496 0.2%
CTCGCTATGTTGCTCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATC No Hit 19 333,588 0.15%
GTAAGATTTGCCGAGTTCCTTTTACTTTTTTTAACCTTTCCTTATGAGCA No Hit 13 327,980 0.22%
CTAGAATAGGATTGCGCTGTTATCCCTAGGGTAACTTGTTCCGTTGGTCA No Hit 18 320,887 0.18%
CACTGGGCAGGCGGTGCCTCTAATACTGGTGATGCTAGAGGTGATGTTTT No Hit 16 317,085 0.19%
ACCCACTCCACCTTACTACCAGACAACCTTAGCCAAACCATTTACCCAAA No Hit 16 307,721 0.17%
CCAAGATAGAATCTTAGTTCAACTTTAAATTTGCCCACAGAACCCTCTAA No Hit 16 302,236 0.2%
CCTGTGTTGGGTTGACAGTGAGGGTAATAATGACTTGTTGGTTGATTGTA No Hit 15 300,644 0.19%
CCCAATTGGACCAATCTATCACCCTATAGAAGAACTAATGTTAGTATAAG No Hit 17 288,272 0.14%
GATAGATTGGTCCAATTGGGTGTGAGGAGTTCAGTTATATGTTTGGGATT No Hit 14 286,684 0.21%
CCCAGCTACTCGGGAGGCTGAGGTGGGAGGATCGCTTGAGCCCAGGAGTT No Hit 14 285,666 0.18%
TGAAAACATTCTCCTCCGCATAAGCCTGCGTCAGATTAAGACACTGAACT No Hit 16 283,030 0.15%
CTGTTCTTGGGTGGGTGTGGGTATAATACTAAGTTGAGATGATATCATTT No Hit 11 278,384 0.27%
CTGGAGGATCGCTTGAGTCCAGGAGTTCTGGGCTGTAGTGCGCTATGCCG No Hit 18 276,977 0.12%
CTGGCTGCGACATCTGTCACCCCATTGATCGCCAGGGTTGATTCGGCTGA No Hit 16 276,963 0.14%
CTTTCCTTATGAGCATGCCTGTGTTGGGTTGACAGTGAGGGTAATAATGA No Hit 13 269,076 0.18%
GGGTATAATGCTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTG No Hit 10 261,648 0.23%
CCACTATTTTGCTACATAGACGGGTGTGCTCTTTTAGCTGTTCTTAGGTA No Hit 16 252,220 0.14%
GGCAAATTTAAAGTTGAACTAAGATTCTATCTTGGACAACCAGCTATCAC No Hit 12 250,692 0.17%
GTTAATTGTCAGTTCAGTGTCTTAATCTGACGCAGGCTTATGCGGAGGAG No Hit 11 239,756 0.23%
CCCAACACAGGCATGCCCATAAGGAAAGGTTAAAAAAAGTAAAAGGAACT No Hit 14 235,164 0.14%
CTTACTTTTAACCAGTGAAATTGACCTGCCCGTGAAGAGGCGGGCATGAC No Hit 14 235,010 0.14%
CAAACCCACTCCACCTTACTACCAGACAACCTTAGCCAAACCATTTACCC No Hit 12 227,145 0.14%
CCTAGGGTAACTTGTTCCGTTGGTCAAGTTATTGGATCAATTGAGTATAG No Hit 12 224,836 0.15%
CCCTAGGGTAACTTGTTCCGTTGGTCAAGTTATTGGATCAATTGAGTATA No Hit 13 220,957 0.16%
GCTGCTTTTAGGCCTACTATGGGTGTTAAATTTTTTACTCTCTCTACAAG No Hit 12 213,591 0.14%
CGTTAAACATGTGTCACTGGGCAGGCGGTGCCTCTAATACTGGTGATGCT No Hit 12 212,738 0.15%
CGTGATCTGAGTTCAGACCGGAGTAATCCAGGTCGGTTTCTATCTACTTC No Hit 13 211,909 0.13%
CTTATGAGCATGCCTGTGTTGGGTTGACAGTGAGGGTAATAATGACTTGT No Hit 11 211,317 0.16%
GAAAAATTATAACCAAGCATAATATAGCAAGGACTAACCCCTATACCTTC No Hit 13 206,617 0.13%
CCTGTTCTTGGGTGGGTGTGGGTATAATGCTAAGTTGAGATGATATCATT No Hit 10 199,683 0.21%
CTTTAATTTATTAATGCAAACAGTACCTAACAAACCCACAGGTCCTAAAC No Hit 12 195,469 0.12%
GTTGATTGTAGATATTGGGCTGTTAATTGTCAGTTCAGTGTCTTAATCTG No Hit 9 194,755 0.17%
GTGGGTATAATACTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTT No Hit 9 193,599 0.25%
CCTGTTCTTGGGTGGGTGTGGGTATAATACTAAGTTGAGATGATATCATT No Hit 10 192,329 0.16%
CCTATACCTTCTGCATAATGAATTAACTAGAAATAACTTTGCAAGGAGAG No Hit 12 190,447 0.13%
GTGGGTGTTGAGCTTGAACGCTTTCTTAATTGGTGGCTGCTTTTAGGCCT No Hit 10 187,546 0.15%
CGCGTGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGTGGGAGGATCGCT No Hit 11 179,839 0.14%
CTTAGCTTTGGCTCTCCTTGCAAAGTTATTTCTAGTTAATTCATTATGCA No Hit 9 169,489 0.2%
GGGGGAAGGCGCTTTGTGAAGTAGGCCTTATTTCTCTTGTCCTTTCGTAC No Hit 9 163,289 0.13%
GTCAAGTTATTGGATCAATTGAGTATAGTAGTTCGCTTTGACTGGTGAAG No Hit 9 161,582 0.13%
CTAGAGGTGATGTTTTTGGTAAACAGGCGGGGTAAGATTTGCCGAGTTCC No Hit 9 159,679 0.16%
CTAGAAATAACTTTGCAAGGAGAGCCAAAGCTAAGACCCCCGAAACCAGA No Hit 10 154,141 0.12%
CCGTTTCCGACCTGGGCCGGTTCACCCCTCCTTAGGCAACCTGGTGGTCC No Hit 8 142,161 0.13%
GGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGATCAGCA No Hit 6 138,843 0.19%
CCAAACCCACTCCACCTTACTACCAGACAACCTTAGCCAAACCATTTACC No Hit 9 138,583 0.11%
GTCACTGGGCAGGCGGTGCCTCTAATACTGGTGATGCTAGAGGTGATGTT No Hit 8 137,393 0.12%
CTGGGCAGGCGGTGCCTCTAATACTGGTGATGCTAGAGGTGATGTTTTTG No Hit 8 137,158 0.16%
CGCTGTTATCCCTAGGGTAACTTGTTCCGTTGGTCAAGTTATTGGATCAA No Hit 7 134,188 0.13%
GTGGCTATTCACAGGCGCGATCCCACTACTGATCAGCACGGGAGTTTTGA No Hit 7 133,657 0.18%
GGTGGCGCGTGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGTGGGAGGA No Hit 8 124,497 0.14%
CACTATTTTGCTACATAGACGGGTGTGCTCTTTTAGCTGTTCTTAGGTAG No Hit 8 123,044 0.13%
CCCCAAACCCACTCCACCTTACTACCAGACAACCTTAGCCAAACCATTTA No Hit 6 113,439 0.14%
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA No Hit 2 112,896 0.42%
CTTAGTTCAACTTTAAATTTGCCCACAGAACCCTCTAAATCCCCTTGTAA No Hit 6 111,566 0.14%
GGGGGTCTTAGCTTTGGCTCTCCTTGCAAAGTTATTTCTAGTTAATTCAT No Hit 5 106,522 0.15%
GTCTGGTTTCGGGGGTCTTAGCTTTGGCTCTCCTTGCAAAGTTATTTCTA No Hit 6 98,824 0.13%
CTTCTATAGGGTGATAGATTGGTCCAATTGGGTGTGAGGAGTTCAGTTAT No Hit 7 96,244 0.12%
CTTATGGGCATGCCTGTGTTGGGTTGACAGTGAGGGTAATAATGACTTGT No Hit 6 94,327 0.12%
CCCAAATAAAGTATAGGCGATAGAAATTGAAACCTGGCGCAATAGATATA No Hit 5 89,230 0.12%
CTTGACCAACGGAACAAGTTACCCTAGGGATAACAGCGCAATCCTATTCT No Hit 5 86,958 0.13%
CTAACAGTTAAATTTACAAGGGGATTTAGAGGGTTCTGTGGGCAAATTTA No Hit 5 81,232 0.14%
CTCGGAGCAGAACCCAACCTCCGAGCAGTACATGCTAAGACTTCACCAGT No Hit 6 78,378 0.11%
GCTCCGTTTCCGACCTGGGCCGGTTCACCCCTCCTTAGGCAACCTGGTGG No Hit 5 77,168 0.15%
CTCGCTATGTTGCCCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATC No Hit 5 76,656 0.13%
CTGTGGGCAAATTTAAAGTTGAACTAAGATTCTATCTTGGACAACCAGCT No Hit 5 73,528 0.12%
GGTTAGTCCTTGCTATATTATGCTTGGTTATAATTTTTCATCTTTCCCTT No Hit 4 72,556 0.11%
CTTAATCTGACGCAGGCTTATGCGGAGGAGAATGTTTTCATGTTACTTAT No Hit 5 69,982 0.13%
CTCTACAAGGTTTTTTCCTAGTGTCCAAAGAGCTGTTCCTCTTTGGACTA No Hit 4 68,019 0.12%
GTTAAATTTTTTACTCTCTCTACAAGGTTTTTTCCTAGTGTCCAAAGAGC No Hit 4 67,436 0.14%
GCCTTATTTCTCTTGTCCTTTCGTACAGGGAGGAATTTGAAGTAGATAGA No Hit 4 67,121 0.16%
GTGGCGCGTGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCTGGAGGAT No Hit 4 66,105 0.12%
GTTGACAGTGAGGGTAATAATGACTTGTTGGTTGATTGTAGATATTGGGC No Hit 4 63,967 0.14%
CTTCACCAGTCAAAGCGAACTACTATACTCAATTGATCCAATAACTTGAC No Hit 4 58,549 0.13%
GTTGATTGTAGATATTGGGCTGTTAATTGTCAGTTCAGTGTTTTAATCTG No Hit 3 57,524 0.15%
CTCCTCTATCGGGGATGGTCGTCCTCTTCGACCGAGCGCGCAGCTTCGGG No Hit 4 57,003 0.11%
GGGATTTAGAGGGTTCTGTGGGCAAATTTAAAGTTGAACTAAGATTCTAT No Hit 3 57,003 0.1%
TATAATACTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTGTGA No Hit 3 56,996 0.13%
GGCTGCTTTTAGGCCTACTATGGGTGTTAAATTTTTTACTCTCTCTACAA No Hit 3 53,576 0.15%
CCCAACCTCCGAGCAGTACATGCTAAGACTTCACCAGTCAAAGCGAACTA No Hit 4 53,284 0.12%
GTATAATGCTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTGTG No Hit 3 47,075 0.12%
CGGGGGTCTTAGCTTTGGCTCTCCTTGCAAAGTTATTTCTAGTTAATTCA No Hit 3 46,339 0.13%
GTTAAACATGTGTCACTGGGCAGGCGGTGCCTCTAATACTGGTGATGCTA No Hit 3 44,382 0.11%
GGGGTTAGTCCTTGCTATATTATGCTTGGTTATAATTTTTCATCTTTCCC No Hit 2 43,843 0.11%
GGCTATTCACAGGCGCGATCCCACTACTGATCAGCACGGGAGTTTTGACC No Hit 2 42,090 0.15%
GTTGGGTTGACAGTGAGGGTAATAATGACTTGTTGGTTGATTGTAGATAT No Hit 2 41,367 0.11%
CGCTTGAGCCCAGGAGTTCTGGGCTGTAGTGCGCTATGCCGATCGGGTGT No Hit 2 41,069 0.1%
GCTGTTAATTGTCAGTTCAGTGTCTTAATCTGACGCAGGCTTATGCGGAG No Hit 2 40,019 0.14%
ATAATACTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTGTGAA No Hit 2 39,602 0.18%
CTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTGTGAAGTAGGC No Hit 2 38,335 0.18%
GTTAATTGTCAGTTCAGTGTTTTAATCTGACGCAGGCTTATGCGGAGGAG No Hit 2 36,217 0.14%
GTTTTTGGTAAACAGGCGGGGTAAGATTTGCCGAGTTCCTTTTACTTTTT No Hit 2 35,451 0.12%
CCTAAAAGCAGCCACCAATTAAGAAAGCGTTCAAGCTCAACACCCACTAC No Hit 2 34,954 0.12%
TGAAAACATTCTCCTCCGCATAAGCCTGCGTCAGATTAAAACACTGAACT No Hit 2 34,947 0.12%
GGGAAGCTCATCAGTGGGGCCACGAGCTGAGTGCGTCCTGTCACTCCACT No Hit 2 34,733 0.12%
GGTTGATTGTAGATATTGGGCTGTTAATTGTCAGTTCAGTGTCTTAATCT No Hit 2 34,505 0.13%
GTGAGGAGTTCAGTTATATGTTTGGGATTTTTTAGGTAGTGGGTGTTGAG No Hit 2 33,509 0.12%
ACTTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATG No Hit 3 32,989 0.15%
CCTTAGCCAAACCATTTACCCAAATAAAGTATAGGCGATAGAAATTGAAA No Hit 2 32,869 0.12%
CCCAACCGAAATTTTTAATGCAGGTTTGGTAGTTTAGGACCTGTGGGTTT No Hit 2 32,648 0.12%
GTAATAATGACTTGTTGGTTGATTGTAGATATTGGGCTGTTAATTGTCAG No Hit 2 32,467 0.12%
ATATAATACTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTGTG No Hit 2 32,228 0.24%
CCTTATTTCTCTTGTCCTTTCGTACAGGGAGGAATTTGAAGTAGATAGAA No Hit 2 32,214 0.13%
GCCCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGAT No Hit 2 31,432 0.11%
CTGGTGATGCTAGAGGTGATGTTTTTGGTAAACAGGCGGGGTAAGATTTG No Hit 2 31,382 0.11%
CCCAAACATATAACTGAACTCCTCACACCCAATTGGACCAATCTATCACC No Hit 2 31,044 0.12%
CCAACACAGGCATGCCCATAAGGAAAGGTTAAAAAAAGTAAAAGGAACTC No Hit 2 30,400 0.11%
CATTCTCCTCCGCATAAGCCTGCGTCAGATTAAGACACTGAACTGACAAT No Hit 2 29,862 0.11%
GTGCGCTATGCCGATCGGGTGTCCGCACTAAGTTCGGCATCAATATGGTG No Hit 2 28,998 0.12%
CCGCACTAAGTTCGGCATCAATATGGTGACCTCCCGGGAGCGGGGGACCA No Hit 2 28,958 0.14%
CTAATGTTAGTATAAGTAACATGAAAACATTCTCCTCCGCATAAGCCTGC No Hit 2 28,325 0.1%
TGGTGACCTCCCGGGAGCGGGGGACCACCAGGTTGCCTAAGGAGGGGTGA No Hit 2 28,063 0.14%
CCCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGATC No Hit 2 27,888 0.13%
CACTAAGTTCGGCATCAATATGGTGACCTCCCGGGAGCGGGGGACCACCA No Hit 2 27,331 0.11%
CGCTATGTTGCCCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCC No Hit 2 27,049 0.1%
GGGATTTTTTAGGTAGTGGGTGTTGAGCTTGAACGCTTTCTTAATTGGTG No Hit 1 26,849 0.12%
ATTGGTTATAATTTTTCATCTTTCCCTTGCGGTACTATATCTATTGCGCC No Hit 2 25,859 0.19%
GGCTGCACCATCGGGATGTCCTGATCCAACATCGAGGTCGTAAACCCTAT No Hit 2 25,712 0.11%
CTTTGCACGGTTAGGGTACCGCGGCCGTTAAACATGTGTCACTGGGCAGG No Hit 2 25,395 0.11%
CGACCATCCCCGATAGAGGAGGACCGGTCTTCGGTCAAGGGTATACGAGT No Hit 2 25,183 0.12%
TGTGGGTATAATACTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCT No Hit 1 25,034 0.11%
CTCAATTGATCCAATAACTTGACCAACGGAACAAGTTACCCTAGGGATAA No Hit 2 24,379 0.11%
CCTCGGAGCAGAACCCAACCTCCGAGCAGTACATGCTAAGACTTCACCAG No Hit 1 23,698 0.11%
GGATTTTTTAGGTAGTGGGTGTTGAGCTTGAACGCTTTCTTAATTGGTGG No Hit 1 23,649 0.12%
GTTCCTTTTACTTTTTTTAACCTTTCCTTATGGGCATGCCTGTGTTGGGT No Hit 1 23,023 0.1%
GTGATAGATTGGTCCAATTGGGTGTGAGGAGTTCAGTTATATGTTTGGGA No Hit 1 22,867 0.1%
TTTAAATTTGCCCACAGAACCCTCTAAATCCCCTTGTAAATTTAACTGTT No Hit 1 22,823 0.1%
GGGGATTTAGAGGGTTCTGTGGGCAAATTTAAAGTTGAACTAAGATTCTA No Hit 1 22,819 0.1%
GGGTGATAGATTGGTCCAATTGGGTGTGAGGAGTTCAGTTATATGTTTGG No Hit 1 22,640 0.1%
ATAAGATTTGCCGAGTTCCTTTTACTTTTTTTAACCTTTCCTTATGGGCA No Hit 2 22,020 0.15%
GGCTTATGCGGAGGAGAATGTTTTCATGTTACTTATACTAACATTAGTTC No Hit 1 21,986 0.1%
CTGTCTCTTACTTTTAACCAGTGAAATTGACCTGCCCGTGAAGAGGCGGG No Hit 1 20,247 0.1%
GCTATTCACAGGCGCGATCCCACTACTGATCAGCACGGGAGTTTTGACCT No Hit 1 17,739 0.12%
CTGTGTTGGGTTGACAGTGAGGGTAATAATGACTTGTTGGTTGATTGTAG No Hit 1 17,726 0.1%
GCTCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGAT No Hit 1 17,185 0.12%
ACTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTGTGAAGTAGG No Hit 1 17,151 0.1%
GTCTGTTCCAAGCTCCGGCAAAGGAGGCATCCGCCGGGCCCCTCCCCGAA No Hit 1 16,802 0.12%
CTCGTATACCCTTGACCGAAGACCGGTCCTCCTCTATCGGGGATGGTCGT No Hit 1 16,776 0.1%
CCGAGTTCCTTTTACTTTTTTTAACCTTTCCTTATGGGCATGCCTGTGTT No Hit 1 16,721 0.1%
CATCAATATGGTGACCTCCCGGGAGCGGGGGACCACCAGGTTGCCTAAGG No Hit 1 15,507 0.11%
CAAACCCACTCCACCCTACTACCAGACAACCTTAGCCAAACCATTTACCC No Hit 1 15,502 0.1%
GTCCGCACTAAGTTCGGCATCAATATGGTGACCTCCCGGGAGCGGGGGAC No Hit 1 15,267 0.11%
TCAGGCTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGATCA No Hit 1 14,689 0.12%
ATGGAGTCTTGGAAGCTTGACTACCCTACGTTCTCCTACAAATGGACCTT No Hit 2 14,480 0.12%
AGGGGTATAATGCTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTT No Hit 1 14,213 0.25%
CTGGAGTGCAGTGGCTATTCACAGGCGCGATCCCACTACTGATCAGCACG No Hit 1 13,698 0.11%
TCCAGGTCGGTTTCTATCTACTTCAAATTCCTCCCTGTACGAAAGGACAA No Hit 1 13,043 0.11%
GCACGGTTAGGGTACCGCGGCCGTTAAACATGTGTCACTGGGCAGGCGGT No Hit 1 12,994 0.11%
CCAGCTATCACCAGGCTCGGTAGGTTTGTCGCCTCTACCTATAAATCTTC No Hit 1 12,710 0.1%
CTGAGGTGGGAGGATCGCTTGAGCCCAGGAGTTCTGGGCTGTAGTGCGCT No Hit 1 12,628 0.11%
CTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATGCCGAACTTAGT No Hit 1 12,518 0.1%
AGGGGGAAGGCGCTTTGTGAAGTAGGCCTTATTTCTCTTGTCCTTTCGTA No Hit 1 12,229 0.21%
TGGCTATTCACAGGCGCGATCCCACTACTGATCAGCACGGGAGTTTTGAC No Hit 1 12,122 0.1%
GTCCTTGCTATATTATGCTTGGTTATAATTTTTCATCTTTCCCTTGCGGT No Hit 1 11,330 0.1%
ATCCGTTTCCGACCTGGGCCGGTTCACCCCTCCTTAGGCAACCTGGTGGT No Hit 1 11,028 0.11%
AGGTATAATGCTAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTG No Hit 1 8,843 0.15%
ATCCAATTGGGTGTGAGGAGTTCAGTTATATGTTTGGGATTTTTTAGGTA No Hit 1 8,194 0.13%
AGGGGAAGGCGCTTTGTGAAGTAGGCCTTATTTCTCTTGTCCTTTCGTAC No Hit 1 7,719 0.13%
ATTAGGCAACCTGGTGGTCCCCCGCTCCCGGGAGGTCACCATATTGATGC No Hit 1 7,451 0.12%
ATAAGTTGAGATGATATCATTTACGGGGGAAGGCGCTTTGTGAAGTAGGC No Hit 1 6,477 0.11%
ATAGAGGTGATGTTTTTGGTAAACAGGCGGGGTAAGATTTGCCGAGTTCC No Hit 1 6,168 0.1%
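The three summary columns in the table above are computed per sequence: the number of libraries containing it, the count summed over those libraries, and the largest single-library percentage. The base R sketch below mirrors that grouping on an invented data frame `os_demo`. The sequences, counts, and the underscore-separated column names are illustrative assumptions, not values from this dataset.

```r
# Hypothetical over-represented sequence records (not real data)
os_demo <- data.frame(
  Filename   = c("lib_A", "lib_B", "lib_A", "lib_C"),
  Sequence   = c("ACGT", "ACGT", "TTTT", "ACGT"),
  Count      = c(100L, 250L, 40L, 50L),
  Percentage = c(0.31, 0.88, 0.12, 0.20)
)

# One row per sequence: libraries it was found in, count summed over
# libraries, and the largest percentage seen in any single library
by_seq <- split(os_demo, os_demo$Sequence)
summary_tbl <- do.call(rbind, lapply(by_seq, function(d) {
  data.frame(
    Sequence        = d$Sequence[1],
    Found_in        = nrow(d),
    Total           = sum(d$Count),
    Largest_Percent = paste0(round(max(d$Percentage), 2), "%")
  )
}))

# Most abundant sequences first
summary_tbl <- summary_tbl[order(-summary_tbl$Total), ]
rownames(summary_tbl) <- NULL
print(summary_tbl)
```

Because the total is summed across libraries while the percentage is a per-library maximum, a sequence found in only a few libraries can rank low by total yet still show a comparatively high largest percentage.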

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] plotly_4.9.2.1      glue_1.4.2          pander_0.6.3       
 [4] scales_1.1.1        yaml_2.2.1          forcats_0.5.0      
 [7] stringr_1.4.0       dplyr_1.0.2         purrr_0.3.4        
[10] readr_1.4.0         tidyr_1.1.2         tidyverse_1.3.0    
[13] ngsReports_1.6.0    tibble_3.0.4        ggplot2_3.3.2      
[16] BiocGenerics_0.36.0 workflowr_1.6.2    

loaded via a namespace (and not attached):
  [1] colorspace_1.4-1            hwriter_1.3.2              
  [3] ellipsis_0.3.1              rprojroot_1.3-2            
  [5] XVector_0.30.0              GenomicRanges_1.42.0       
  [7] ggdendro_0.1.22             fs_1.5.0                   
  [9] rstudioapi_0.11             farver_2.0.3               
 [11] ggrepel_0.8.2               DT_0.16                    
 [13] fansi_0.4.1                 lubridate_1.7.9            
 [15] xml2_1.3.2                  leaps_3.1                  
 [17] knitr_1.30                  jsonlite_1.7.1             
 [19] Rsamtools_2.6.0             Cairo_1.5-12.2             
 [21] broom_0.7.2                 cluster_2.1.0              
 [23] dbplyr_2.0.0                png_0.1-7                  
 [25] compiler_4.0.3              httr_1.4.2                 
 [27] backports_1.2.0             assertthat_0.2.1           
 [29] Matrix_1.2-18               lazyeval_0.2.2             
 [31] cli_2.1.0                   later_1.1.0.1              
 [33] htmltools_0.5.0             tools_4.0.3                
 [35] gtable_0.3.0                GenomeInfoDbData_1.2.4     
 [37] reshape2_1.4.4              FactoMineR_2.3             
 [39] ShortRead_1.48.0            Rcpp_1.0.5                 
 [41] Biobase_2.50.0              cellranger_1.1.0           
 [43] vctrs_0.3.4                 Biostrings_2.58.0          
 [45] crosstalk_1.1.0.1           xfun_0.19                  
 [47] rvest_0.3.6                 lifecycle_0.2.0            
 [49] zlibbioc_1.36.0             MASS_7.3-53                
 [51] zoo_1.8-8                   hms_0.5.3                  
 [53] promises_1.1.1              MatrixGenerics_1.2.0       
 [55] SummarizedExperiment_1.20.0 RColorBrewer_1.1-2         
 [57] latticeExtra_0.6-29         stringi_1.5.3              
 [59] highr_0.8                   S4Vectors_0.28.0           
 [61] BiocParallel_1.24.0         GenomeInfoDb_1.26.0        
 [63] rlang_0.4.8                 pkgconfig_2.0.3            
 [65] matrixStats_0.57.0          bitops_1.0-6               
 [67] evaluate_0.14               lattice_0.20-41            
 [69] labeling_0.4.2              GenomicAlignments_1.26.0   
 [71] htmlwidgets_1.5.2           tidyselect_1.1.0           
 [73] here_0.1                    plyr_1.8.6                 
 [75] magrittr_1.5                R6_2.5.0                   
 [77] IRanges_2.24.0              generics_0.1.0             
 [79] DelayedArray_0.16.0         DBI_1.1.0                  
 [81] pillar_1.4.6                haven_2.3.1                
 [83] whisker_0.4                 withr_2.3.0                
 [85] scatterplot3d_0.3-41        RCurl_1.98-1.2             
 [87] modelr_0.1.8                crayon_1.3.4               
 [89] rmarkdown_2.5               jpeg_0.1-8.1               
 [91] grid_4.0.3                  readxl_1.3.1               
 [93] data.table_1.13.2           git2r_0.27.1               
 [95] reprex_0.3.0                digest_0.6.27              
 [97] flashClust_1.01-2           httpuv_1.5.4               
 [99] stats4_4.0.3                munsell_0.5.0              
[101] viridisLite_0.3.0